Matching Multi-lingual Subject Vocabularies

Authors

  • Shenghui Wang
  • Antoine Isaac
  • Balthasar A. C. Schopman
  • Stefan Schlobach
  • Lourens van der Meij
Abstract

Most libraries and other cultural heritage institutions use controlled knowledge organisation systems, such as thesauri, to describe their collections. Unfortunately, as most of these institutions use different such systems, unified access to heterogeneous collections is difficult. Things are even worse in an international context when concepts have labels in different languages. In order to overcome the multilingual interoperability problem between European Libraries, extensive work has been done to manually map concepts from different knowledge organisation systems, which is a tedious and expensive process. Within the TELplus project, we developed and evaluated methods to automatically discover these mappings, using different ontology matching techniques. In experiments on major French, English and German subject heading lists Rameau, LCSH and SWD, we show that we can automatically produce mappings of surprisingly good quality, even when using relatively naive translation and matching methods.

Similar resources

Application of Radon Transform in Detecting Turning Angle of Bodies and in Reading Multi-Lingual Documents

Recently, image processing techniques and robotic vision have been widely applied in fault detection of industrial products as well as document reading. In order to compare the captured images of the target, it is necessary to prepare a perfect image, and then matching should be applied. Preprocessing must therefore be done to correct the samples' and/or camera's movement, which can occur during the...

N-Gram Language Modeling for Robust Multi-Lingual Document Classification

Statistical n-gram language modeling is used in many domains, such as speech recognition, language identification, machine translation, character recognition and topic classification. Most language modeling approaches work on n-grams of terms. This paper reports on ongoing research in the MEMPHIS project, which employs models based on character-level n-grams instead of term n-grams. The models ar...

Multi-lingual Features of the Unified Medical Language System

The Unified Medical Language System (UMLS) is a terminology integration system developed and maintained by the U.S. National Library of Medicine (NLM). Over the past 20 years, the UMLS Metathesaurus has been extended to encompass 168 source vocabularies. While English is the dominant language (116 source vocabularies), 52 vocabularies in the Metathesaurus are in languages other than English (6 ...

An API for Multi-lingual Ontology Matching

Ontology matching consists of generating a set of correspondences between the entities of two ontologies. This process is seen as a solution to data heterogeneity in ontology-based applications, enabling the interoperability between them. However, existing matching systems are designed by assuming that the entities of both source and target ontologies are written in the same languages (English...

Publication date: 2009